Accurate implementation of leaping in space: The spatial partitioned-leaping algorithm 
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There is a great need for accurate and efficient computational approaches that can account for both the dis- 
crete and stochastic nature of chemical interactions as well as spatial inhomogeneities and diffusion. This is 
particularly true in biology and nanoscale materials science, where the common assumptions of deterministic 
dynamics and well-mixed reaction volumes often break down. In this article, we present a spatial version of the 
partitioned-leaping algorithm (PLA), a multiscale accelerated-stochastic simulation approach built upon the r- 
leaping framework of Gillespie. We pay special attention to the details of the implementation, particularly as it 
pertains to the time step calculation procedure. We point out conceptual errors that have been made in this regard 
in prior implementations of spatial r-leaping and illustrate the manifestation of these errors through practical ex- 
amples. Finally, we discuss the fundamental difficulties associated with incorporating efficient exact-stochastic 
techniques, such as the next-subvolume method, into a spatial-leaping framework and suggest possible solutions. 
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I. INTRODUCTION 

In our attempts to understand the behavior of systems at 
increasingly small scales, the importance of random fluctua- 
tions, or noise, is becoming increasingly apparent. Indeed, the 
phenomenon is the subject of great interest in a variety of di- 
verse fields, including cellular biology GJ-Q], semiconductor 
processing HHt and heterogeneous catalysis ifioll . 

From a computational perspective, incorporating the ef- 
fects of stochasticity into models of physical processes re- 
quires moving beyond traditional continuum-deterministic ap- 
proaches, such as ordinary differential equations (ODEs), and 
using one of a variety of stochastic methods. Within the 
purview of chemical kinetics, a popular technique is Gille- 
spie's stochastic simulation algorithm (SSA) 111 1 1 - 4 1 311 - The 
method is extremely accurate, easy to implement and has 
found widespread use in computational systems biology. Its 
downside however, is speed, and the algorithm can become 
proh ibitively slow due to its one-reaction-at-a-time nature 

III El. 

This fact has spawned considerable effort, from a variety 
of directions, to develop methods for overcoming this in- 
herent limitation of exact-stochastic approaches. A particu- 
larly popular type of accelerated-stochastic approach is "r- 
leaping", originally devised by Gillespie lfl5ll and expanded 
upon by numerous investigators Illa - OOll . including ourselves 
113111 - In general, leaping methods have proved quite success- 
ful in overcoming some, but not all, of the problems plaguing 
exact-stochastic simulation methods 11311 . 



All of the methods cited above operate under the assump- 
tion that the volume within which the reactions are "firing" is 



"well mixed." In more precise terms, the assumption is that 
the time scale of diffusion is fast enough so that all entities 
(e.g., molecules) of the same species have equal probability 
of reacting at any given point in time. However, it is not hard 
to imagine situations where this assumption breaks down. In 
solid-state systems, for example, diffusion is much slower 
than in fluids and the local environment seen by a dopant 
atom, say, plays a much larg er role in its dynamics jjj. In 
biology, both eukaryotic l32ll and prokaryotic l33ll cells have 
intricate internal structures that act to localize certain inter- 
actions and processes. The sheer size of cellular components 
also leads to a highly crowded and definitively non-well-mixed 
intracellular environment 04113511. 
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In situations such as these, methods that account for spatial 
inhomogeneity and diffusion are needed. In the extreme case, 
it may be necessary to track the fates of individual entities, 
or "agents" If36l, 13711 . However, a more common situation is 
one where the system of interest can be partitioned into multi- 
ple smaller domains, or "subvolumes." Each subvolume is as- 
sumed to be well-mixed and coupled to neighboring subvol- 
umes via a jump-diffusion processes. Various extensions of 
the SSA have been successfully implemented along these lines 
I38l - l42tl . General overviews of both agent- and subvolume- 
based spatial-stochastic simulation approaches ap plie d in bi- 
ology and materials science can be found in Refs. II43T - I47I1 . 

In spatially inhomogeneous systems, the shortcomings of 
the exact-stochastic approach are intensified. In general, each 
subvolume is given local copies of each reaction and diffu- 
sion event. Thus, the number of possible events in the sys- 
tem increases significantly with increasing number of sub- 
volumes, often making SSA-like methods infeasible. A par- 
tial solution to this problem lies with the leaping methods. 
While the number of events in the system remains unchanged 
(and hence still a potential problem), spatial leaping meth- 
ods achieve accelerations by allowing all reaction and dif- 
fusion events to fire multiple times at each simulation step. 
We are aware of two implementations of leaping algorithms 
along these lines, those of Marquez-Lago and Burrage J48tl 
and Rossinelli et al. l49ll . Marquez-Lago and Burrage propose 
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a method that is a leaping an alogue to the well-known "next- 
subvolume method" (NSM) l40ll4lll . an efficient spatial SSA 
variant. Rossinelli et al. present a more straightforward exten- 
sion of leaping in space that considers reaction and diffusion 
events separately. 

In this article, we present a spatial implementation of our 
own method, the partitioned-leaping algorithm (PLA) l3lll . 
Our implementation is similar in spirit to the methods of 
Marquez-Lago, and of Rossinelli, but differs in some im- 
portant ways. In particular, we take special care with re- 
gards to the calculation of time steps. We point out some 
conceptual errors that were made in this regard in refs. ll48tl 
and 114911 and demonstrate, through numerical examples, how 
these errors may affect accuracy and efficiency. We show 
that, in some cases, the spatial partitioned-leaping algorithm 
(SPLA) is faster than these methods and at least as accurate. 
In other cases, SPLA is slower but significantly more accu- 
rate. In yet other cases there is little difference. We explain 
the origins of this differential behavior and its consequences 
for practical applications of the methods. Finally, we dis- 
cuss the fundamental difficulties associated with incorporat- 
ing exact-stochastic approaches like the NSM into a spatial- 
leaping framework and suggest possible strategies for over- 
coming them. 

In Sec. [n] we present an overview of relevant exact- and 
accelerated-stochastic simulation methods for both homoge- 
neous (well-mixed) and inhomogeneous systems that set the 
stage for the new SPLA approach in Sec. [HI] Sec. |IV| shows 
results from three simple example systems that exemplify the 
gains in accuracy and efficiency achieved by the method. Fi- 
nally, we conclude in Sec. IVlwith a discussion of these results 
and their implications for future extensions of leaping meth- 
ods. 



II. BACKGROUND 

We consider a chemically reactive system of fixed volume 
fly and constant temperature that is partitioned into L well- 
mixed sub volumes V = {Vi, . . . ,Vl}- Each sub volume Vi 
has a fixed volume w; (%2i>=i = ^y) an d is adjacent to 

(< L — 1) neighboring subvolumes C; = {C/i, . . . , Cir^. 
In principle, each V\ contains a unique set of Ni molecular 
species S; = {Sa, . . . , that participate in Mi unique 

reactions R; = {Ru, . . . , RiMi}- We assume that all Ni 
species can diffuse into and out of all T; neighboring sub- 
volumes. Thus, each Vi has NiTi outgoing diffusion events 
D; = {Dn, . . . , DiNiF,} associated with it as well as NiTi 
incoming diffusion events D; = {Dn, . . . , Dm^, }■ It is im- 
portant to recognize that each Di^ is a reference to an outgo- 
ing diffusion event from an adjacent subvolume. All together, 
there are a total of M; + 2N[Ti reaction and diffusion events 
associated with each Vj. We thus define, without loss of gen- 
erality, the event vector E; =R;+D;+D;. 

The state of the system is represented by the vector X(i) = 
J2i'=i X|' (t), where Xu(t) is the population of species Si in 
subvolume Vi at time t, i € {1, . . . , Ni}. Each event chan- 



nel Eifj, is associated with a propensity function a; At (X(i)) 
(the stochastic analogue to the deterministic reaction rate) 
and a stoichiometry vector z; M = {zi^i , . . . , zi^Ni }, H- € 
{l,...,M,+2JViri}(see JH). 



A. Exact-stochastic methods 

1. Well-mixed systems 

Gillespie's SSA operates within a fully well-mixed system 
(i.e., L = 1) flU [l2"ll . The approach determines when the next 
reaction will fire in the system and of which type it will be. 
Two mathematically equivalent approaches were presented 
for accomplishing this: the direct method (see 111311 for de- 
tails) and the first-reaction method. The first-reaction method 
determines when each reaction in the system would fire ;/ it 
were the only reaction present in the system and then chooses 
r as the smallest of these values and fi as the corresponding 
reaction. Such "tentative" next-reaction times are calculated 
via 

T- act = -ln(rv)/a^), (1) 

where r M is a unit-uniform random number. As originally for- 
mulated, the first-reaction method requires M unit-uniform 
random number generations at each simulation step, M — 1 
of which are discarded before proceeding on to the next step. 
An improvement upon this approach is Gibson and Brack's 
next-reaction method (NRM) [13lTl . The next-reaction method 
basically uses a rigorous random-variable transformation for- 
mula to reuse the generated random numbers in the next time 
step. This reduces the number of random number generations 
per time step to exactly one, along with M'—l calculations of 

r^ ct = {a' fl (t)/a ll (t))(r^' -r'), (2) 

where the unprimed and primed quantities signify new and old 
values, respectively. 



2. Inhomogeneous systems 

The direct method and first-reaction method essentially 
constitute two ends of a spectrum with regards to the group- 
ing of reactions. In the direct method, the entire system of 
reactions is basically considered to be one large group. In 
the first-reaction method (and NRM by extension), each reac- 
tion is considered individually, i.e., as a group of one. Thus 
any method intermediate between these two is also a theoreti- 
cally sound approach 11311 . From a practical point of view, this 
means we are free to group reactions into subgroups as we see 
fit. We can then choose among those subgroups using the di- 
rect method or first-reaction method (or NRM or any other 
equivalent method, e.g., ll52l,l53ll ) and then choose within the 
subgroup in the same way. Moreover, we can nest the sub- 
groups into as many levels as we like if we find it convenient 
to do so. 
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Such a procedure has the effect of parsing out the com- 
putational load into multiple stages and can, in many cases, 
significantly improve the efficiency of the method. A well- 
known such approach is Elf and Ehrenberg's next-subvolume 
method (NSM) HolEl], a spatial SSA variant that discretizes 
space into subvolumes and groups events (reaction and dif- 
fusion) based on the subvolume within which they reside. 
The NSM operates by calculating the summed propensities 

aio(t) = Ylvlt Nl ' a iv{t) for a ll subvolumes I £ {1, . . . , L}. 
The subvolume within which the next reaction will fire is then 
identified using a heap search as in the NRM 15 111 and the 
identity of the firing reaction within the subvolume using a 
linear search as in the direct method flUl . This two-level ap- 
proach significantly reduces the computational effort relative 
to a straightforward heap or linear search over all events in the 
domain. 



B. Leaping approaches 

1. t leaping 

As mentioned previously, the primary shortcoming of 
exact-stochastic simulation methods, whether applied to well- 
mixed systems or otherwise, is that every event firing is sim- 
ulated explicitly. This imposes a tremendous computational 
burden on the algorithm, particularly if one or more species 
have large populations. 

To address this problem, Gillespie proposed the r-leaping 
approach, which proceeds by firing multiple reaction events 
at each simulation step lfl5ll . In the well-mixed case, we first 
define the random variable A' At (r) as the number of times re- 
action channel fires during the time interval [t, t+r). The 
time evolution of the system can be formally written in terms 
of this variable as 

M 

X(i + r) =X(i) +^ Zi ,A, y (r). (3) 

The idea then is to calculate some r over which all reaction 
propensities remain "essentially constant". In such a case, the 
reaction dynamics can be assumed to obey Poisson statistics 
and 

K^t) « PMt)<r), (4) 

where 7 : '(a M (i)r) is a Poisson random variable with mean and 
variance a M (i)r. Note that the dependence in Eq. (HJi on the 
value of at the beginning of the step, i.e., at the initial time 
t, makes this an "explicit" approach, analogous to explicit 
methods used in the numerical integration of ODEs lfl7ll . 

Equations (0 and © constitute the essence of the (explicit) 
r-leaping method. At each step of a simulation, a time step 
r is calculated (see Sec. Ill B 31 below) and the system state 
updated by generating M Poisson random deviates {k u (r)} 
in keeping with Eq. ©. Added to this is a proviso that if the 
total number of expected firings, ao(i)r, is "small" (~ 10) 
then some variant of the SSA is used instead 11511 . 



Since its inception, modifications to the r-le aping a pproach 
have been proposed by various investigators iflq-ulll. There 
are recent reviews by Gillespie lfl3ll and Pahle 15411 Though 
differing in various aspects, all of these methods are based on 
the same basic principles encapsulated in Eqs. (fj) and (0). 



2. Partitioned leaping 

In Refs. and JH], Gillespie went beyond Eq. (HJi and 
noted a well-known property of the Poisson distribution that 
it can be approximated by a normal, or Gaussian, distribution 
if the mean is "large." This allows us to write 

Ayr) » V(a^(t)r) « N{a^t)T, o M (t)r) 

= a^(t)T + a^(t)r x JV(0, 1) (5) 

where Af(0, 1) is a normal random variable with mean zero 
and unit variance lfl5l l55ll . Written this way, Eq. © is 
equivalent to the chemical Langevin equation 115511 . a stochas- 
tic differential equation comprised of a "deterministic" term 
and a fluctuating "noise" term. Gillespie then noted that as 
a M (t)r — > oo the noise term becomes negligible relative to the 
deterministic term, giving 

A^(t) a a M (t)r, (6) 

which is equivalent to the forward-Euler method for solving 
deterministic ODEs JHH1. 

In Ref. l3lll . we introduced the partitioned-leaping algo- 
rithm, a r-leaping variant that utilizes the entire theoretical 
framework encompassed by Eqs. ©-(Ell. The partitioned- 
leaping algorithm considers reactions individually in a way 
reminiscent of the NRM. After calculating a time step r 
(Sec. Ill B 3b , each reaction is classified into one of four cat- 
egories: exact-stochastic, Poisson, Langevin and determinis- 
tic. Reactions classified at the three coarsest levels (Poisson, 
Langevin, deterministic) utilize Eqs. ©-(O, respectively. Re- 
actions classified at the exact-stochastic level are handled as 
in the NRM [Eqs. and ©]. Incorporating the SSA into 
the multiscale framework of the partitioned-leaping algorithm 
is thus seamless and simple. Details of the algorithm can 
be found in ref. ll3lll . with a demonstration of its utility in 
ref. 01. 



3. t selection 

The central task in leaping algorithms is the manner in 
which the time step r is determined. Indeed, the entire method 
hinges on the validity of the Poisson approximation Eq. ©, 
which requires that the propensities of all reactions change 
negligibly during r. To quantify this requirement, Gillespie 
defined the "leap condition" lfl5il55ll . 

K(i + r)-<v(t)|/£<e, (0<e«l) (7) 

where £ is an appropriate scaling factor (see below). 
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Three main classes of r-selection procedure have been pro- 
posed: (i) a pre- leap reaction-based (RB) approach that uses 
Eq. directly EH El ED], (ii) a pre-leap species-based 
(SB) approach where changes in the species populations are 
constrained such that Eq. (|7]) is satisfied for all reactions 
[El EH, and (iii) a post-leap checking procedure that explic- 
itly ensures that Eq. (0 is satisfied at all simulation steps 12811 . 
Gillespie's initial r-selection strategy was an reaction-based 
approach with £ = ao(t) lfl5i [l6ll . which we will refer to as 
RB-a . More recently, Cao et al. Il9ll proposed an improved 
reaction-based approach with £ = Ou(t), which we will refer 
to as RB-a^, as well as a species-based approach, which we 
will refer to as SB-a^. The central task in this article involves 
modifying these formulas for use in spatial simulations (see 

Sec. ins. 

4. Spatial T-leaping 

Spatial leaping approaches involve grouping events (reac- 
tion and diffusion) by subvolume, calculating a characteristic 
time interval T; leap for each subvolume and then choosing the 
global time step 

r = min {r!, oap }. (8) 
i'e{i...L} 

Every reaction and diffusion event can then fire multiple times 
within r. 

Marquez-Lago and Burrage l48ll attempted to generalize 
the NSM within the framework of such a leaping algorithm. 
The local time intervals r' eap are calculated using the RB- 
ao r-selection procedure of Gillespie and Petzold IU6I1 . mod- 
ified accordingly to apply to each subvolume. A binomial r- 
leaping variant l2lll is used for calculating event firings and 
provisions are made to segue to the NSM when the species 
populations are small. 

Rossinelli et al. l49ll presented a similar implementation 
of spatial r-leaping with the primary difference being that 
they considered reaction and diffusion events independently 
of each other. Interestingly, they did not provide provisions to 
segue to a SSA method in the limit of small populations. 

There are, however, some conceptual errors with both 
Marquez-Lago's and Rossinelli's spatial r-leaping methods. 
We aim address these concerns in Sec. Un]in our development 
of the SPLA and outline the differences between the three spa- 
tial leaping algorithms in Sec. MI El 

An important aspect of the spatial r-leaping algorithms is 
that, contrary to the exact-stochastic case (Sec. Ill A 2b . group- 
ing events by subvolume does not reduce the total number of 
calculations required in r selection. In the NSM, a charac- 
teristic time interval rf xact can be obtained for a given sub- 
volume via a single evaluation of Eq. ([TJ with a M (t) replaced 
by aio(t). Thus, L total calculations are required to deter- 
mine r. In the spatial r-leaping case, however, each T; lcap 
requires performing r-selection calculations for each reaction 
(RB-ao/RB-a^) or species (SB-a M ) in VJ. The total number 
of calculations required to determine r in this context thus far 
exceeds L. 



This is a fundamental difference between the approaches 
that complicates the incorporation of spatial SSA meth- 
ods like the NSM into a spatial leaping framework. In 
Sec. lIIIBl we present optimized pre-leap r-selection formulas 
for subvolume-based spatial r-leaping methods that minimize 
computational effort by only considering those events that di- 
rectly affect each reaction or species in Vi. In Sec.|V] we spec- 
ulate on alternative approaches that can fundamentally reduce 
the cost of r selection by allowing a single calculation to be 
performed for a group of events, analogous to the procedure 
employed in the NSM. 

III. THE SPATIAL PARTITIONED-LEAPING 
ALGORITHM (SPLA) 

A. Motivation 

A major concern with the Marquez-Lago and Burrage 
method is the exclusion of incoming diffusion events in the 
r-selection process. In the NSM, incoming diffusion can be 
ignored when selecting values of r because events outside of 
the subvolume have no bearing on when the next event within 
the subvolume will fire. In leaping methods, however, this is 
no longer the case: the relationships between events are of 
central importance in selecting values of r. Ignoring incom- 
ing diffusion in r selection is thus an error that may impact 
the accuracy and/or efficiency of the method to an a priori in- 
determinable extent. In Sec. |IV] we will show cases where 
this leads to inappropriately large values of r and, hence, in- 
creased error, and cases where it results in unnecessarily small 
values of r and decreased efficiency. Another concern in 
Marquez-Lago's method is the use of the RB-cto r-selection 
procedure which is not as theoretically sound as (and has been 
shown to be less accurate than) the RB-a M and SB-a^ proce- 
dures lll9l] . It appears that the RB-ao method was chosen to 
emulate the NSM. 

In the case of Rossinelli's method, the primary concern 
is the independent consideration of reactions and diffusion 
events during r selection. In principle, this is inappropriate 
because the firings of reactions are intimately related to the 
rates at which entities diffuse into and out of subvolumes, and 
vice versa. Ignoring this fact can introduce error and/or affect 
the efficiency of the method. Furthermore, the exclusion of 
a mechanism for transitioning to a exact-stochastic method in 
the limit of small populations introduces additional error, as 
shown in Sec.lIVI 



B. Spatial r selection 

In SPLA, we address each of the above issues: (i) both in- 
coming and outgoing diffusion are taken into account in the 
r-selection process, (ii) reactions and diffusion events are con- 
sidered together when selecting time steps, (iii) appropriately 
modified formulations of the RB-a M and SB-a^ r-selection 
procedures are used, and (iv) the method automatically segues 
to an exact-stochastic method (NRM) at low populations. 
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In general, the SPLA can be seen as an accurate, straight- 
forward implementation of spatial leaping against which fu- 
ture enhancements can be compared. The method was not 
intended to be faster than other spatial r-leaping methods, 
though this is a worthy goal, and, as we shall see, it often is 
not faster. In such cases, the advantage of using SPLA should 
be measured in terms of accuracy. Sometimes SPLA is faster 
than other methods because it produces larger time steps. This 
is particularly true for systems close to equilibrium where ne- 
glecting incoming diffusion can cause the algorithm to deter- 
mine that the leap condition Eq. (0 has been violated sooner 
than it actually has. 

As in previous implementations of spatial r-leaping, we se- 
lect time steps by calculating leap time intervals r' cap for each 
subvolume V/ and then setting r equal to the smallest of these 
[Eq. dH)]. In Table U we present a spatial version of the RB- 
a M r-selection procedure used in this article. In Table [TT1 we 
present the equations for the spatial SB-a^ procedure. We pay 
special attention to the ranges over which minimizations and 
summations are performed in these equations. In the RB-a^ 
case, one value of r'° ap is calculated for each of the M1+N1T1 
reaction and outgoing diffusion events in VJ. In the SB-a M 
procedure, one T ; 1 I eap calculation is required for each of the 
Ni species in V/. In Eqs. ( fT3b , ( fT4l i, ( fl9l ) and d20l i, summa- 
tions are taken over all Mi+2N{Ti events associated with Vi. 
This is necessary to take into account the effect of incoming 
diffusion and is critical for implementing an accurate spatial 
leaping algorithm. 



TABLE I: Spatial versions of the RB-a M r-selection formulas of 
Cao et al. fl9ll . as modified in Harris and Clancy IfJltl . One r^ ap 
calculation is required for each reaction and outgoing diffusion event 
in Vi. Note that in Eq. l !12t . af^ n is the smallest possible non-zero 
value of a; M (a™ n = c; M for elementary reactions). 

Spatial RB-a M 



leap 
T l = 



min {r;,r p } 



leap . J £lv(t) 6 ?mW 



|"i iM (t)| ' o 



l/i 



t) 



e; M (i) = max{ea; M (t),/3;^(t)} 

f outfall {%^}=0 

(aa hl (t)\ a . 
mm < fl ' > otherwise 
(se{i...N l -} I dx 'i I 

A/ i +2JV,r, 
M,+2N,r, 



(9) 

(10) 
(11) 

(12) 

(13) 
(14) 
(15) 



C. The algorithm 

We define a domain of constant volume and divide it into L 
(not necessarily equal-sized) subvolumes, each of volume uji, 
using a finite difference type discretization. A connectivity 
matrix C = {Ci, . . . , Cl} is used to specify the neighboring 
subvolumes and the geometry of the domain. Boundary con- 
ditions are applied (e.g., periodic, reflecting) by appropriately 
defining C. The SPLA then proceeds as follows: 

1. Initialization: 

(i) For each subvolume, Vf. Set initial populations 
X;(0) for all Ni local species and define M; 
reactions in which these species participate. Cal- 
culate initial values of the propensities {ai v (0)}, 
v E {1. . .Mi + NiTi} for all reactions and outgoing 
diffusion events. Set the time variable i = ii n it- 

(ii) Define global parameters e (^Cl), 'al' and '^>1' 
used in r selection and event classification (typical 
values are 0.01-0.05, 3 and 100, respectively Bill ). 

2. Calculate an initial (global) time step r [Eq. ((HJ] using 
either the RB-a M r-selection procedure of TableQ]or the 
SB-a M procedure of Table [TT1 

3. Classify all Mi+NiTi reaction and outgoing diffusion 
events within each Vi based on the values of a; M (t)r 



(see Sec. Ill B 2b . Prevent classification of diffusion 
events as exact-stochastic if the population of the dif- 
fusing species Xu(t) > 100 (see Sec. lHIDl i. 

4. For all events (newly) classified as exact-stochastic, 
generate values of rf^ act using Eqs. (HJ and/or (0. 

5. (i) If min{r^ act } < r, I' g {1. . .L}, v £ {all exact- 

stochastic events}, set r = mmjr™ *} and return 
to step [3] (this may require multiple iterations; see 
JH). 

(ii) Else, if minjr^ *} > r and all events are classi- 
fied as exact-stochastic, set r = minjr^ '} (no 
iterations required). 

(iii) Else, retain r. 

6. Determine the numbers of event firings {&;;'„ (r)}, /' € 
{1. . .L}, v e {1. . .Mi+NiTi}, based on the classifica- 
tions. For the three coarsest descriptions, Eqs. (Uji-© 
are used, respectively ll57ll . For exact-stochastic events, 
if r^ act =r then fc; M (r) = 1, otherwise zero. 

7. Fire all events and update populations. 

8. If any Xu(t + T) < 0, revert all populations to then- 
previous values, determine the numbers of event firings 
within the shorter time interval [t+r/2) as {fc;/„(r/2) = 

B{k Vv {r), 1/2)}, V e {i. . .L}, v £ {l. . Jif,+jv,r,}, 
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TABLE II: Spatial versions of the SB-a M r-selection formulas of 
Cao et al. flf^l . One T H ea,p calculation is required for each species 
in Vi. Note that in Eq. l !18b . the parameter gu depends on the types 
of events species Su participates in. See 1131 for formulas applicable 
to elementary event types, 13111 for simplified versions of these, and 
1 56] for extensions to select non-elementary events. 

Spatial SB-a M 

^'tAf^ (16) 
^P = min /^L,£|^\ (17) 

eu(t) = max{eXu(t)/gu, 1} (18) 
(0 < g u < oo) 

M l +2N,r l 

fnii(t) = ^2 zi vi a lv (t) (19) 

Afj+2AT,r i 

&ii(t) = 22 ^viO-iA 1 ) (2°) 



etrating too deep into the interior of the domain and signifi- 
cantly speeds the simulations with negligible loss in accuracy. 
Our choice of 100 as the threshold is based on the fact that dif- 
fusion is usually modeled as a first-order process and, hence, if 
the population is 100 then one firing will result in a 1 % change 
in the propensity. 1 % is a reasonable value for e and is at the 
lower end of the typical values that we use. Nevertheless, this 
approach is clearly ad hoc and it would be preferable to have a 
more general strategy that applies globally to all event types, 
not just diffusion events. In the future, we hope to develop 
such an approach. For the sake of demonstration, however, 
we believe that this simple strategy suffices. 

In step [8] of the SPLA, we employ the post-leap checking 
procedure of Anderson 12811 . which is theoretically stronger 
than the "try again" approach employed in (step 8 of) the orig- 
inal PLA Il3lll and in r-leaping fla, fl9tl . However, would 
like to emphasize that in the SPLA, we make minimal use 
post-leap checking and only to handle those rare occasions 
in which negative populations arise. Post-leap checking has 
much broader potential as an alternative r-selection approach 
that can improve the efficiency of the SPLA, either on its own 
or coupled with the reaction-based or species-based proce- 
dures of Tables HI and HIl 



where B(n,p) is a binomial random deviate with n at- 
tempts and a success probability of p (post-leap check- 
ing Jll]; see Sec. IIIIDI) and set r = r/2. Return to 
step |7] 

9. Advance the time to t + r and return to step [2] unless 
stopping criterion has been satisfied. 



D. Technical issues 

In step[3]of the SPLA, we include a provision that diffusion 
events should not be classified as exact-stochastic if the pop- 
ulations of the diffusing species are greater than 100. This is 
a somewhat arbitrary restriction that deserves explanation. In 
our initial trials, we often obtained time steps much smaller 
than expected, significantly diminishing the efficiency of the 
method, sometimes to a level close to that of the NRM. We 
identified the source of this problem as diffusion events at the 
leading edge of diffusing fronts. In these regions, the numbers 
of diffusing molecules are small and, as such, diffusion events 
obtain exact-stochastic classifications. In many instances, the 
values of rf^ act generated for these events were smaller than 
t, requiring a reduction in the time step and a reclassifica- 
tion of all events [stepQi) above]. This often led to events 
in subvolumes away from the leading edge being classified as 
exact-stochastic that previously were not, which would then 
produce an even smaller time step, and so. This "classifica- 
tion cascade" ultimately resulted in values of r much smaller 
than necessary. The same behavior was observed in a previ- 
ous application of the PLA to a model biological system f56l 
note 80]. 

The provision in step [3] of the SPLA was included in order 
to overcome this problem. It prevents the cascade from pen- 



E. Marquez-Lago, Rossinelli and some SPLA variants 

In order to assess the performance of the SPLA, we im- 
plemented Marquez-Lago's and Rossinelli's spatial r-leaping 
methods for comparison, as well as variants of the SPLA that 
incorporate select features of those methods for diagnostic 
purposes. Marquez-Lago's method differs from the SPLA in 
two important ways: (i) it calculates values of r'^ ap using the 
RB-ao r-selection procedure 



leap 



f 2 a 2 



(t) 



ea; (t) 
N„(*)l' crUt) 



(21) 



where a;rj(i) = YlwLi lT> a iv(t), and (ii) incoming diffusion 
is ignored in these calculations. The latter means that mi^{t) 
and erf (t) are calculated as in Eqs. (fT~3T > and (fl4] > of Table J] 
but with the summations running over v € {1. . .Mi + NiTi} 
only. Values of r ; lcap are calculated using Eq. (O of Table Q] 
and r is selected as in Eq. 

Marquez-Lago's method also transitions to using an exact- 
stochastic method in Vi if aio(t)r < 10. This amounts to 
classifying the subvolume as exact-stochastic which, in turn, 
experiences either one event firing within r or none at all. 
If one event fires, then event selection is performed as in 
the direct method. Consequently, if all subvolumes are clas- 
sified as exact-stochastic, then the algorithm becomes the 
NSM l40l l4lll . The numbers of firings within non-exact- 
stochastic subvolumes are determined using a binomial r- 
leaping variant 1I2TI1 . Importantly, the method does not use 
the continuum descriptions Eqs. (O and (|6]l that are used in 
the SPLA. Note that, instead of binomial r-leaping, we use 
standard Poisson r-leaping coupled with the negative popula- 
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tion check of step|8]of the SPLA. We consider this difference 
to be inconsequential in comparing the methods. 

The primary differences between the SPLA and the 
Rossinelli's spatial r-leaping method are: (i) they apply the 
SB-a M r-selection procedure of Table [Hi separately to reaction 
and diffusion events and, (ii) they do not provide a mecha- 
nism for segueing to a exact-stochastic method in the limit of 
small populations. For each subvolume Vi, rj eap values are 
calculated by 

r, loap =min{T ( rxn ,7f ff }. (22) 

with t? xn and r ; dlff being time steps for reactions and diffusion 
events respectively. They are obtained using modified forms 
of Eq. ( [ToT l in Table HI1 Basically, for each Su, two values of 
T, 1 i eap are calculated, one considering only reactions and the 
other only diffusion events (outgoing and incoming). These 
are obtained via Eq. (fTTI i of Table HH with fhu(t) and Sfj(t) 
calculated using Eqs. (T% and d20l i. respectively, but with the 
summations running only over 2/£{l. . .Mi} for r™ 1 and 
{Mi+l. . .Mi+2N[T[} for rf is . Thus, in our implementation 
of Rossinelli's method, we replace step [2] of the SPLA with 
this r-selection procedure. We also eliminate steps |3]-|5]of the 
SPLA and use only Eq. (|4]i in step|6](i.e., no exact-stochastic, 
Langevin or deterministic classifications). 

Finally, we also implement three variants of the SPLA: (i) a 
"one-way diffusion" variant that sums only over v S { 1. . .Mj+ 
NiTi} in Eqs. Q3) and (O of TableUand Eqs. <[T3) and © 
of Table HI1 during r selection [step|2]of the SPLA], (ii) a "no 
ES reactions" variant that prevents reaction events from be- 
ing classified as exact-stochastic in step [3] of the SPLA, and 
(iii) a "no ES events" variant that prevents all events (reac- 
tion and diffusion) from being classified as exact-stochastic. 
The first variant allows us to quantify the effects of ignoring 
incoming diffusion in r selection. The last variant gives us 
insight into the importance or tradeoff of transitioning to a 
exact-stochastic method in the limit of small populations. The 
second variant is used to exemplify the need for a more gen- 
eral strategy to address the classification cascade problem dis- 
cussed in Sec. HHP I These variants provided us with insight 
into the operation of the SPLA and allowed us to make con- 
nections to Marquez-Lago's and Rossinelli's methods. 



IV. NUMERICAL EXAMPLES 

In order to demonstrate the utility of the SPLA, we apply 
the method to three classical spatial systems: pure diffusion 
in one dimension (Sec. |IV At , the one-component reaction- 
diffusion system described by Fisher's equation f58l l59ll in 
one dimension (Sec. lIVBl , and the two-component reaction- 
diffusion system described by the Gray-Scott equations ll60Tl 
in two dimensions (Sec. |IV Ct . In all cases, we consider the 
domain partitioned into L equally-sized subvolumes. Diffu- 
sion is modeled as a first-order elementary process 

Su — h Sfi, (23) 



where Vp is an adjacent subvolume (i.e., Vi> € C;) and the 
microscopic diffusivity d% is constant throughout the domain. 
Propensities for diffusion events are thus of the form 

ai„(t) = diX H {t), /i e {Mi + 1. . .Mi + NiFi}. (24) 

Microscopic diffusivities are obtained from macroscopic dif- 
fusion coefficients Di via the relation l42ll 

dt = D./h 2 , (25) 

where h is the side length of the regular subvolumes. We 
choose h such that the size of the subvolume is less than the 
diffusion length of the system, given by V ADt (where D is 
diffusivity and r is the time step). However, the time step r 
can vary significantly during the course of the simulation. It 
is affected by the rate of diffusion, which in turn is affected by 
the subvolume size (Ref. Eq. d25ll). Hence this formula can 
only be used approximately. This circular dependency can be 
partially addressed by running a sample simulation, taking the 
most-frequent time step and then using that to calculate the 
subvolume size such that the well-mixed assumption is main- 
tained. All SPLA simulations are performed with e = 0.01, 
'wl' = 3and 1 >1' = 100. 



A. Pure diffusion 

The first system we considered was pure diffusion of a 5 
function in one dimension. Apart from being the simplest ex- 
ample of a diffusing front, this system is ideal for study be- 
cause analytical solutions are well known and the stochastic 
mean corresponds to the deterministic solution. 

We define a one-dimensional domain of width 0.4 m (in 
say, the y-direction) and cross-sectional area A and divide it 
into L = 40 equally-sized subvolumes, each of width 0.01 m 
(bJi = 0.01.4 m 3 ). We populate one subvolume at the center of 
the domain [see Fig. [T| with between X(0) = 1 and 5 x 10 7 
particles of species S and then vary A in order to maintain a 
constant concentration of 0.04 M over the whole domain. We 
apply Neumann (no flux) boundary conditions at each end of 
the domain, and define a constant (y-directional) diffusion co- 
efficient D = 10~ 3 cm 2 /s. The system can then be represented 
by the set of transformations 

S^Si+u Ze{l...L-l}, (26) 

d 

where d is obtained from Eq. d25l l. The partial differential 
equation that describes this system in the deterministic limit 
is 

?*m . D **p>. ( 27, 

In Fig. Q] we compare particle distributions at t = 2 s for an 
initial S spike of 1000 particles obtained from a representa- 
tive SPLA simulation of ( f26b and from Eq. ( f27] >. The results 
coincide well, although the effects of stochasticity are clearly 
visible. 
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FIG. 1: Particle distributions at t = 2 s for pure diffusion of a 1000 
particle 8 spike obtained using the SPLA and Eq. ( 127b . The initial 
delta function is shown as a dashed line. The cross-sectional area A 
is set such that the total concentration over the domain is 0.04 M. 
The SPLA simulation was performed using the SB-a M r-selection 
procedure of Table|n](SPLA-SB). 



In Fig. |2] we present a computational cost analysis compar- 
ing the SPLA to the NSM. In Fig. Eta), we see that the SPLA, 
using both the reaction-based (SPLA-RB) and species-based 
(SPLA-SB) r-selection procedures of TablesHlandHIl requires 
almost exactly the same number of simulation steps as the 
NSM up to about 1000 total particles. Beyond that, we see 
a significant difference between the methods, with the cost of 
the SPLA decreasing with increasing number of particles and 
that of the NSM continuing to increase linearly. The reason 
why the two SPLA methods coincide exactly is because we 
model diffusion as a first-order elementary process [Eq. d23l)l. 
Thus, the constraint on | Aa/ M (t) | used in reaction-based r se- 
lection is identical to that on |AXh(£)| used in species-based 
r selection. 

In Fig. EJb), we compare the CPU times for each of the 
three methods. Here, we see that up to about 1000 total par- 
ticles the NSM is actually the least expensive of the methods. 
The SPLA-SB is close behind, however, being slightly less ef- 
ficient because of the computational overhead associated with 
r selection. Beyond 1000 total particles, we see that the SPLA 
decreases in computational cost while the cost of the NSM 
continues to increase linearly. Interestingly, SPLA-RB is sig- 
nificantly less efficient than the SPLA-SB, despite the fact that 
both methods take the exact same number of steps on average. 
This is due to two factors: (i) the total number of r leap calcula- 
tions required in reaction-based r-selection (Mi + NiTi = 78) 
as compared to species-based (Ni = 40), and (ii) the extra 
expense associated with calculating rate derivatives in RB r- 
selection [Eqn. Q3) . Since Ni will often be much less than 
AIi+NiTi, we see that there is a distinct advantage to using 
SB r-selection in spatial leaping simulations. 

In Fig. |3] we compare the accuracy of the SPLA-SB to 
the NSM for an initial S spike of 10 4 particles. We omit the 
SPLA-RB since the results are identical to the SPLA-SB. We 
see that, although the SPLA requires about an order of magni- 
tude fewer steps [Fig. Eta)], there is essentially no difference 
between the means and standard deviations obtained from 
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FIG. 2: (a) Average numbers of simulation steps and (b) average 
CPU times vs. total particle number for pure diffusion of a S function 
till t = 2s using the SPLA-RB, SPLA-SB and NSM. In each case, 
the particle number was changed by varying the cross-sectional area 
A, while maintaining a constant concentration of 0.04 M over the 
domain. All results are averaged over 500 simulation runs performed 
on an Intel Core 2 Duo, 2.13 GHz machine with 2 GB of RAM. 



both methods over the entirety of the domain. We make sense 
of this by referring to the works of Cao et al. 16111 and Rathi- 
nam et al. ll62ll . both of which show that in explicit r-leaping 
methods (like SPLA), for sufficiently small r, the histograms 
generated using a r-leaping method should be virtually in- 
distinguishable from those obtained using an exact-stochastic 
method. Our results in Fig.[3]thus simply indicate that we are 
using a small enough error control parameter (e = 0.01) in 
r selection and thus avoiding any noticeable errors. 



B. Fisher's equation 

Fisher's equation (also known as the Fisher-Kolmogorov- 
Petrovskii-Piscounov equation) 115 8i 15911 is a deterministic par- 
tial differential equation that has been used to describe the 
propagation of an advantageous gene in a population 1158m and 
the spatio-temporal evolution of a species under the combined 
effects of diffusion and logistical growth l59ll . In one dimen- 
sion, the equation is of the form 



du 



Ku(c 



. „<9 2 u 



(28) 



9 




0.0 0.1 0.2 0.3 0.4 



Position (m) 

FIG. 3: Means and standard deviations of the particle number over 
the entire domain at t = 2 s for pure diffusion of a 10 4 particle 5 
function using the SPLA-SB and the NSM. In both cases, results are 
from 500 simulation runs. The dotted lines constitute an envelope of 
twice the standard deviation about the SPLA-SB mean. 
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FIG. 4: Solution of the one-dimensional Fisher's equation d28t . The 
initial condition is shown by the dashed lines. The deterministic tra- 
jectory (blue) is shown at t = 3.73 s, the time at which the solution 
reaches half its saturation value at y — 0.2 m. A stochastic trajectory 
(red) is shown at t — 5.0 s 



where u is a species concentration, K is a second-order reac- 
tion rate constant, c is the "carrying capacity" or "saturation 
value" of the system and D is a diffusion coefficient. 

We again consider a one-dimensional domain of width 
0.4 m and cross-sectional area A and divide it into L = 40 
equally-sized subvolumes (coi = 0.01^4 m 3 ) with Neumann 
(no flux) boundary conditions applied at each end. On this 
domain, we consider the reaction-diffusion system 

S n + Si2^2Sn, le{l...L}, 
Su£S(i+i)i, l€{l...L-l}, (29) 

d 

Sa - <% +1)2 , Ze{l...L-l}. 

d 

Because S\ and £2 have equal diffusivities throughout the 
domain, in the deterministic limit the total population within 
each subvolume Xit = Xn (t)+Xi2(t) is constant. The spatio- 
temporal evolution of Si can thus be described by Fisher's 
equation (|28]) with d = D/h 2 , u = Xi(y,t)/Q,, K = kQ, 
c=Xix/£l, where = N^uji and Na is Avogadro's number. 

Initially, we take the first compartment to be saturated 
with Si [i.e, Xn(0) = cil; see Fig. [4) and all other com- 
partments to be saturated with The saturation value c 
is taken be 10~ 4 M and we choose D = 10 -4 m 2 /s and 
K = 7x 10 4 M _1 s _1 . To investigate the effects of stochasti- 
city, we hold c constant and vary the particle number Xit by 
varying the cross-sectional area A. In Fig. [4] we show a snap- 
shot of the traveling wave of Si obtained by solving Fisher's 
equation (l28l i and the corresponding reaction-diffusion system 
d29). 

In Fig. [5] we show a computational cost analysis comparing 
different variants of the SPLA-SB to the NSM and Marquez- 
Lago's and Rossinelli's spatial r-leaping methods. [We do not 
consider the SPLA-RB since it is significantly less efficient 
than the SPLA-SB]. Simulations are run until t = 25 s, and the 
results are averaged over 500 runs. In Fig. |5ja), we see that, 
at low populations, the numbers of simulation steps for all 



methods scale linearly with the number of particles, although 
the Rossinelli method and SPLA-(no ES events) take an or- 
der of magnitude fewer steps. This indicates that these meth- 
ods are firing multiple events even when the populations are 
small. Above about Xit = 100, however, we see a divergence 
from the linear trend for all of the leaping methods. The cost 
of the Marquez-Lago and Rossinelli methods are independent 
of system size above this point, while that for the full SPLA 
drops initially, but then continues to increase linearly beyond 
about Xit = 500. However, when we selectively disable the 
exact-stochastic classification for reactions only in the SPLA 
we see that the cost decreases significantly, approaching those 
of Marquez-Lago and Rossinelli. This indicates that, for this 
system, reaction events are causing a classification cascade 
at large populations just as diffusion events did in our initial 
studies. This exemplifies the need to develop a more general- 
ized approach for handling the classification cascade problem 
(see Sec. IIIIDt . In Fig. |5jb), we see similar trends for the 
CPU times, although Marquez-Lago's method and the SPLA 
are somewhat more costly than the NSM at small populations 
because of the added overhead associated with r selection. 

The results in Fig. [5] would seem to indicate that the SPLA 
is always slower than both the Marquez-Lago and Rossinelli 
methods. However, this is not entirely true. In Fig. [6] we show 
the time steps taken during representative simulation runs with 
Xit = 10 4 for the various methods. We see that during the first 
~ 7 s, when the wave is propagating across the domain, the 
time steps for the SPLA are small. Rossinelli's method takes 
slightly larger time steps during this period while Marquez- 
Lago's time steps are significantly larger. The scatter of par- 
ticularly small time steps for the full SPLA exemplifies the 
classification cascade effect evident in Fig. [5] We also see how 
forbidding the exact-stochastic classification for reactions pre- 
vents this from occurring. At ~ 7 s, however, the situation 
changes dramatically. The system approaches equilibrium and 
the time steps for all leaping methods increase significantly, 
with Rossinelli's method experiencing the largest jump, fol- 
lowed by the SPLA and then Marquez-Lago. Also note how 
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FIG. 5: (a) Average numbers of simulation steps and (b) average 
CPU times vs. Xit for simulations till t — 25 s of the reaction- 
diffusion system J29t using various methods. In each case, the par- 
ticle number is changed by varying the cross-sectional area A while 
maintaining a constant concentration of c within each Vi . All results 
are averaged over 500 simulation runs performed on an Intel Core 2 
Duo, 2.13 GHz machine with 2 GB of RAM. 
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FIG. 6: Time steps calculated during individual simulation runs of 
the reaction-diffusion system l |29t with Xit = 10 4 using various 
methods. 



the classification cascade problem ceases in the SPLA. 

We make sense of these results by considering the one-way 
diffusion variant of the SPLA (Sec. 1111 El l, where incoming 
diffusion is ignored in r selection. We see in Fig. [6] that this 
results in significantly smaller time steps for t > 7 s. The time 
period after ~7s corresponds to the equilibrium state of the 
system, i.e., it takes ~ 7 s for the wave to travel across the 
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FIG. 7: Percent deviations between mean wave velocities obtained 
from various leaping methods and the NSM for varying system 
sizes «V) lcap -(V) NSM / (V) NSM x 100 %). (In set) Convergence of 

(V) NSM to the analytical solution V = 2 V DKc l63ll with increasing 
system size. All results are based on 500 leaping and NSM simu- 
lation runs. Note that the apparent discrepancies between the NSM 
and the SPLA and Marquez-Lago methods at small particle numbers 
are simply due to random sampling error [also see Fig.[8]c)]. 



domain. At equilibrium, incoming diffusion replenishes the 
numbers of particles in subvolumes. Ignoring this causes the 
algorithm to underestimate the time at which the leap condi- 
tion Eq. (0 will be violated. This explains why the time steps 
for Marquez-Lago's method are smaller during this phase than 
other leaping methods and, if we were to run the simulations 
longer than 25 s, SPLA would become more efficient. The 
remaining disparity between Marquez-Lago and the one-way 
SPLA is due to differences in r-selection procedure. The 
larger time steps for Rossinelli during this phase are due to 
their separate consideration of reaction and diffusion events. 

We find that the different time steps obtained by various 
methods give rise to different traveling wave velocities V, 
which we can use to compare the accuracies of the various 
methods. We measure the velocity as the time taken for S\ 
to reach half its saturation value at y = 0.2 m. For the Heav- 
iside initial condition, the analytical expression for the wave 
velocity is V = 2y 'DKc l63tl . However, stochastic effects 
give rise to a distribution of wave velocities for the same ini- 
tial condition. Recent authors have shown that, depending on 
the values of c and K, the mean of the velocity distribution 
can differ from the analytical velocity ll64l - l66ll . particularly at 
low populations. Thus, instead of the analytical solution, we 
use the mean velocity (V) NSM obtained from 500 NSM sim- 
ulations as the standard for comparison. In Fig. [71 we show 
percent deviations between the mean wave velocities obtained 
from the various leaping methods and (V) NSM as a function 
of Xit- In the inset, we show the convergence of (V) NSM to 
the analytical solution with increasing number of particles. 

Fig. |7J shows that, for small populations, Rossinelli's 
method has large errors in its wave velocity. The error de- 
creases with increasing population and becomes negligible at 
the largest system sizes considered. Marquez-Lago's method, 
on the other hand, shows the opposite trend: the error is negli- 
gible at small populations and increases with increasing pop- 
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ulation. We can explain these observations by referring back 
to Figs. [5] and [6] The error in Rossinelli at small populations 
is mainly due to the fact that the method lacks a mechanism 
for transitioning to a SSA method. Thus, the increase in effi- 
ciency seen in Fig.[5]comes at the cost of accuracy. The error 
in Marquez-Lago's method is due to the combined effect of 
the r-selection procedure and neglecting incoming diffusion. 
At small populations, the method transitions to NSM, thus re- 
ducing error. At large populations, however, the method takes 
larger time steps than the other leaping methods during the 
wave-propagation phase (Fig.[6]i, resulting in the increased er- 
ror seen in Fig. [7] Furthermore, in the equilibrium phase (after 
~ 7 s), the method takes smaller steps than SPLA and the error 
then changes from one of accuracy to one of efficiency. SPLA 
addresses each of these issues and shows negligible error over 
the entire population range. SPLA-(one way) tries to capture 
just the effect of neglecting incoming diffusion. However, 
we do not observe any significant error because the method 
transitions to an exact-stochastic method at small populations 
and, calculates leap time steps that are fairly near the accurate 
SPLA time steps at higher populations. 



While the means of velocities are instructive in providing 
general insight into the accuracies of the methods, they do 
not give complete information about the particle distributions 
over the entire domain. Thus, for a more accurate analysis, 
we use the Kolmogorov-Smirnov test 16711 . a statistical test 
used to compare two given distributions. If the particle num- 
ber distribution for Si within a given subvolume Vi at time t is 
P(Xn(t)) for a leaping method and P(X n (t)) for the NSM, 
then the Kolmogorov distance between the two distributions is 
defined as tC(Xn(t)) = max \F(X n (t))-F(Xii(t))\, where 
F(x) = f_ P{x)dx is the cumulative distribution function 

of P(x). The reference distribution P(Xn(t)) is also asso- 
ciated with a "self distance" S(Xn(t)) J67], which is a mea- 
sure of the uncertainty associated with building the distribu- 
tion from a finite set of realizations. Only if JC{Xn(t)) > 
S(Xn(t)) can we say that the two distributions are statisti- 
cally distinct. 



In Fig. M we plot the differences K.(X n (5)) - S(X n (5)) 
over the entire domain I e {1. . .L} obtained using the full 
SPLA and the methods of Marquez-Lago and Rossinelli for 
various values of Xyr- A positive value of this difference 
indicates regions where the solution obtained from the vari- 
ous leaping methods differs, in a statistically significant sense, 
from the NSM. These plots reinforce the observations made 
above: (i) errors arise in Marquez-Lago at large populations, 
(ii) errors arise in Rossinelli at small populations, and (iii) 
the full SPLA is accurate over the entire domain for all sys- 
tem sizes considered. Moreover, we see that the errors in 
Figs.[8ja)-(c) arise mainly at the propagating wavefront. 



C. Gray-Scott equations 



First studied by Pearson [60], the Gray-Scott equations 

du 



- -uv 2 + F(l - u) + D u V 2 u, 
at 

dv o ,„ o 

— = uv 2 - (F + k)v + D v V 2 v, 



(30) 



describe the spatio-temporal behavior of a two-component 
reaction-diffusion system. The equations are of particular in- 
terest because they produce a rich variety of spatio-temporal 
patterns based on the values of F and k. Here, we set F = 
0.035, k = 0.060, D u = 2xl0" 5 m 2 /s and D v = 10~ 5 m 2 /s. 

We consider a two-dimensional domain of width 0.5 m and 
length 0.5 m (in say, the y-z plane) and height H and divide 
it into a regular 50 x 50 grid (L = 2500; w; = 0.25iJ m 3 , fl = 
Naoji) with periodic boundary conditions. On this domain, 
we consider the two-component reaction-diffusion system 

S n + 2S l2 3Si2, le{l...L}, 

k2 „ 



Si 2 



di 

Sn ^ Si' i, 

di 

Sl 2 Sl'2, 
d2 



le{i...L], 

l€{l...L}, 

le{i...L}, l'eC h 

le{i...L}, I'eCi, 



(31) 



If we set the parameters ki = 1/il 2 s _1 , k 2 = F s _1 , k-2 = 
Ffl s _1 and k% = F + k s _1 , then in the deterministic limit 
the spatio-temporal evolutions of Si and S2 are described by 
the Gray-Scott equations ( f30b with u = X 1 (y, z, t)/fl, v = 
X 2 (y,z, t)/fi, di = D u /h 2 and d 2 = D v /h 2 . 

In the deterministic case, a unique pattern is obtained from 
Eqs. d30b for a give n set of parameters {F,k, D Ul D v } and ini- 
tial conditions 16011 . However, the pattern formation behavior 
can change significantly in the presence of noise l68ll . to the 
extent that large amounts of internal noise can prevent pattern 
formation altogether [69[]. We investigate the effects of noise 
in the Gray-Scott system by performing stochastic simulations 
using the SPLA-SB, Marquez-Lago and Rossinelli methods, 
but we consciously choose conditions that minimize stochas- 
tic effects so that direct comparisons can be made to the de- 
terministic solution. 

Initially, we set the concentration of Si and S2 in each sub- 
volume to 1 M and 0.1 M respectively and choose H such 
that 1 M corresponds to to 5000 particles. We then apply a 
perturbation that triggers pattern formation in the reaction dif- 
fusion system. In Fig. [9] we show snapshots of the patterns 
obtained from the different simulations methods and from the 
solution of Eqs. (f30t at t= 1500 s. 

By comparing Figs. |9ja)-(c) to Fig. |9£d), the effects of 
noise are visually evident. Rather than the smooth pattern pro- 
duced in the deterministic case, those obtained from the leap- 
ing methods have clear fluctuations. The effects are small, 
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FIG. 8: Kolmogorov distances at t = 5 s obtained using: (a) Marquez-Lago, (b) Rossinelli, and (c) SPLA. The x-axis corresponds to position 
within the domain and the y-axis to system size (i.e., Xit)- All results are based on 500 leaping and NSM simulation runs. The NSM self 
distance is subtracted from the Kolmogorov distances and negative values are clipped to zero. The black line is the mean position of the 
wavefront for different system sizes. We can infer from this that the simulation error arises mainly at the propagating wavefront. 




FIG. 9: Snapshots of the Gray-Scott reaction-diffusion system J3U 
at t= 1500 s obtained using (a) Marquez-Lago, (b) Rossinelli, (c) the 
full SPLA-SB, and (d) Eqs. The concentration of Si (plotted 
above) ranges from (blue) to 1 (red) M. The features present in 
regions I, II and III are compared for different simulation methods. 
All simulations are performed with the parameters F = 0.035, k = 
0.060, D u = 2 x 10~ 5 m 2 /s and D v = 10" 5 m 2 /s. 



however, and all of the patterns are superficially similar, al- 
though close inspection reveals perceptible differences. We 
highlight three regions (I, II and III) in the deterministic solu- 
tion of Fig. |9]to make visual comparison of the patterns easier. 
The pitchfork-type pattern in region II is present in SPLA and 
to some extent in Rossinelli's method. That feature is barely 
recognizable Marquez-Lago 's method. Similarly in region I, 
SPLA's pattern is the closer to the deterministic solution than 



other methods. However, the patterns present in region III are 
similar in all the simulation methods. From this we argue that 
the SPLA pattern in Fig.[9|c) is most similar to the determin- 
istic solution in Fig. |9jd) and that the Marquez-Lago pattern 
in Fig. |9{a) is most dissimilar. Since we consciously aimed 
to minimize the effects of stochasticity in the pattern forma- 
tion, these results imply that the most faithful description of 
the system dynamics is given by the SPLA. 

In order to ascertain why this is, we compare the time 
steps taken by the SPLA to those for the Marquez-Lago and 
Rossinelli methods. The main result (plot not shown) is that 
the full SPLA generally takes smaller time steps than the other 
methods, explaining why it gives more accurate results. 

It is important to note that our analysis of the Gray-Scott 
system is limited due to the large number of total events in 
the system. With 2500 subvolumes, each with four nearest 
neighbors, there are a total of 10 4 reactions and 2x 10 4 diffu- 
sion events that must be taken into account. As such, a single 
SPLA simulation of 1500 s took 1.43 h to complete. Marquez- 
Lago and Rossinelli simulations took a comparable amount of 
time (0.36 h and, 0.92 h respectively). This is an important re- 
sult because it exemplifies a serious shortcoming of the spatial 
r-leaping approach in general. Although leaping is beneficial 
in allowing multiple event firings at each simulation step, the 
high cost of r selection severely limits the applicability of the 
approach in the face of large event numbers, as is common in 
spatially-discretized systems. Thus, in order to make the ap- 
proach practicable, improving the efficiency of the method is 
of paramount importance. We discuss this issue in more detail 
in Sec.[V] 



V. DISCUSSION AND CONCLUSION 

We have presented the spatial partitioned-leaping algo- 
rithm as an accurate formulation of the leaping approach 
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for reaction-diffusion systems on discretized grids. Our pri- 
mary contributions have been to correctly enumerate all of 
the events that must be considered during the time-step cal- 
culation process and to recast the reaction-based and species- 
based r-selection formulas lfl9l l3~Tll within a spatial context 
(Tables Q] and [U}. The main differences between these for- 
mulas and those used in prior implementations of spatial r- 
leaping II48[ 14911 are that reaction and diffusion events are 
considered together and the effects of incoming diffusion are 
properly taken into account. Both aspects are crucial for an 
accurate spatial leaping implementation and we have shown, 
through numerical examples, how improper consideration can 
lead to the introduction of error or a reduction in efficiency, 
depending on the specifics of the system being studied. We 
have also shown the implications, in terms of accuracy, of not 
providing a mechanism for transitioning to a exact-stochastic 
method in the limit of small populations. 

Furthermore, we have shown that the species-based r- 
selection procedure, besides being inherently less costly per 
calculation than the reaction-based procedure (because of the 
lack of rate derivatives lO), will generally require far fewer 
total calculations for spatial systems than the reaction-based 
approach. This is because the total number of events in a 
discretized system will often far exceed the total number of 
species. Species-based r-selection will thus be the preferred 
choice in most situations. Exceptions include cases where rate 
constants are time dependent, e.g., if environmental quantities 
such as temperature and volume vary in time (species-based 
r-selection assumes time-invariant rate constants). In such 
situations, modified forms of the reaction-based r-selection 
formulas will be required. 

Inclusion of the exact-stochastic classification in the algo- 
rithm brings along problems of classification cascade, induced 
by events at the edge of diffusing fronts, which eventually re- 
sults in an unnecessarily small time step and, hence, a sig- 
nificant reduction in efficiency. This phenomenon has been 
observed previously for a well-mixed biochemical system in- 
volving binding of transcription factors to individual genes 
|[56ll and is a shortcoming of the PLA in general. Here, we 
have attenuated this problem to an extent by restricting the 
classification of diffusion events as exact-stochastic when the 
population of the diffusing species exceeds a pre-specified 
threshold (i.e., 100). This approach is ad hoc, however, and 
we have demonstrated the need to develop a more general ap- 
proach that can handle all cases. Work is currently underway 
in this direction. 

A shortcoming of the SPLA, and other spatial leaping meth- 
ods in general, is the strong dependence of the computational 
cost on the total number of events in the system. This is a well- 
known problem for stochastic simulation algorithms lf/0ll and 
is exemplified by the large amount of time taken to analyze 
the moderately complex Gray-Scott system (involving 3x 10 4 
unique events involving 5000 unique species). These methods 
remain constrained by the fact that one r-selection calculation 
must ultimately be performed for each event (reaction-based) 
or species (species-based) present in the system. In order to 
make the approach practicable, a solution to this problem is 
clearly required. A computational approach can be to par- 



allelize the algorithm, parsing out the computational effort 
across multiple machines. Many aspects of the SPLA are in- 
deed parallel in nature, such as r selection, event classification 
and event update. From an algorithmic perspective, Ander- 
son's post-leap checking procedure 112 811 may provide some 
relief in that it obviates the need to perform the expensive pre- 
leap calculations. Pettigrew and Resat l25ll have proposed an 
approximate post-leap checking procedure that might prove 
useful as well. 

Another possibility is to fundamentally reduce the number 
of r selection calculations by performing them on groups of 
events rather than on individual events or species. The chal- 
lenge, however, is that in contrast to the exact-stochastic case 
(Sec. Ill A 2\ . it is not permissible within the context of a leap- 
ing algorithm to group arbitrary sets of events and then per- 
form r selection on the group. This is because the leap con- 
dition Eq. ^ applies at the level of individual events, not 
groups. Basically, there is no guarantee that a given change in 
the summed propensity of the group will translate into equiv- 
alent changes in the propensities of the events that comprise 
the group. However, it may be possible to identify special 
types of groups in which this is, in fact, the case. This type 
of grouping, based on event type rather than on location, is 
fundamentally different from that used in typical spatial sim- 
ulation methods. It also differs from the type of grouping used 
in the multinomial r-leaping method of Pettigrew and Resat 
rf25ll . a well-known binomial r-leaping variant. We are ac- 
tively pursuing this avenue of research. 

Compounding the problem of exact-stochastic event classi- 
fications is that SPLA transitions to NRM, which is an ineffi- 
cient exact-stochastic method for spatial simulations. Ideally, 
the method would segue to an efficient spatial SSA formu- 
lation such as the NSM. However, the NSM, which is based 
on grouping events by subvolume, does not fit naturally into 
the framework of the SPLA for the reasons cited above, i.e., 
r selection cannot be applied at the level of groups. Marquez- 
Lago incorporate the NSM into their spatial r-leaping method 
by classifying subvolumes as exact-stochastic if aio(t) < 10 
(Sec. IIII El l, emulating the approach taken by Gillespie, Pet- 
zold and co-workers fl5[ [l6l [l9tl . We could employ a similar 
approach in the SPLA. However, it is our hope that a more 
natural method of transition will arise from our attempts to 
incorporate grouping generally into the leaping methodology. 

Our development of the SPLA is significant in that it rep- 
resents a "gold standard" in terms of accuracy against which 
future enhancements and extensions to the spatial r-leaping 
approach can be compared. As a straightforward implemen- 
tation of spatial leaping, the method is not maximally opti- 
mized in terms of efficiency nor is it meant to be. However, 
it does achieve the maximum possible gains in efficiency for 
a method that accurately employs pre-leap r selection at the 
level of individual events by considering only those events that 
are of consequence to the calculation. These include local 
reactions and outgoing and incoming diffusion events to and 
from neighboring subvolumes. We hope that future innova- 
tions addressing the challenges highlighted here will help to 
further improve the leaping methodology and make stochastic 
simulations of complex systems practicable. 
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